Integrative omics analyses of the ligninolytic Rhodosporidium fluviale LM-2 disclose catabolic pathways for biobased chemical production

Background Lignin is an attractive alternative for producing biobased chemicals. It is the second major component of the plant cell wall and is an abundant natural source of aromatic compounds. Lignin degradation using microbial oxidative enzymes that depolymerize lignin and catabolize aromatic compounds into central metabolic intermediates is a promising strategy for lignin valorization. However, the intrinsic heterogeneity and recalcitrance of lignin severely hinder its biocatalytic conversion. In this context, examining microbial degradation systems can provide a fundamental understanding of the pathways and enzymes that are useful for lignin conversion into biotechnologically relevant compounds. Results Lignin-degrading catabolism of a novel Rhodosporidium fluviale strain LM-2 was characterized using multi-omic strategies. This strain was previously isolated from a ligninolytic microbial consortium and presents a set of enzymes related to lignin depolymerization and aromatic compound catabolism. Furthermore, two catabolic routes for producing 4-vinyl guaiacol and vanillin were identified in R. fluviale LM-2. Conclusions The multi-omic analysis of R. fluviale LM-2, the first for this species, elucidated a repertoire of genes, transcripts, and secreted proteins involved in lignin degradation. This study expands the understanding of ligninolytic metabolism in a non-conventional yeast, which has the potential for future genetic manipulation. Moreover, this work unveiled critical pathways and enzymes that can be exported to other systems, including model organisms, for lignin valorization. Supplementary Information The online version contains supplementary material available at 10.1186/s13068-022-02251-6.

Background
Lignocellulosic biomass is the most abundant and low-cost source of fermentable sugars and building blocks for the production of biofuels and value-added chemicals, which makes it an attractive alternative to fossil fuels [1]. It is primarily composed of cellulose, hemicellulose, and lignin. While cellulose and/or hemicellulose can be utilized to produce biofuels and chemicals, the use of lignin has been limited to energy supply. Lignin is the second major component of plant cell walls, imparting structural stability to plant tissues and fibers, and forming a barrier against microbial infections [2]. It consists of phenylpropanoid monomeric guaiacyl (G), p-hydroxyphenyl (H), and syringyl (S) units randomly interconnected mainly by β-O-4 aryl ether bonds [3]. Considering that 70 million tons of lignin are extracted annually during pulping operations [4], lignin is an abundant natural source of aromatic compounds. However, lignin valorization requires efficient methods for the degradation and conversion of complex mixtures of lignin-derived aromatic compounds into bioproducts. Some microorganisms have been described as being capable of bioconverting lignin into value-added compounds, such as vanillin [5], polyester precursors [6,7], and even nylon [8], indicating their usefulness for lignin valorization.
In the second step, bacteria, fungi, and yeasts assimilate aromatic compounds through funneling pathways, converting the molecules into key central metabolic intermediates or target compounds [10,16]. Despite the diverse array of oxidized fragments originating from lignin degradation, catabolism proceeds mainly via two major compounds, protocatechuate and catechol (both of which are cleaved to form central intermediates) [9,16]. For instance, during the degradation of G-unit rich lignin, ferulic acid is converted into vanillin by feruloyl-CoA synthetase (FCS) and feruloyl-CoA hydratase-lyase (FCHL). Vanillin is then cleaved into protocatechuate (vanillin pathway), and the latter is converted into central intermediates (pyruvate, acetyl-CoA, and succinate) through three different pathways, namely the protocatechuate 4,5-cleavage, protocatechuate 2,3-cleavage, and β-ketoadipate pathways [10].
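The funneling route described above can be written down as a small directed graph, which makes it easy to list every route from ferulic acid to the central intermediates. The following is only an illustrative sketch: the dictionary layout, the function name `routes`, and the collapsing of pyruvate, acetyl-CoA, and succinate into a single "central intermediates" node are assumptions made for this example, not part of the study.

```python
# Each substrate maps to (product, enzyme or pathway label) pairs, as named in the text.
# The data structure itself is a hypothetical illustration.
FUNNELING_STEPS = {
    "ferulic acid": [("feruloyl-CoA", "FCS")],
    "feruloyl-CoA": [("vanillin", "FCHL")],
    "vanillin": [("protocatechuate", "vanillin pathway")],
    "protocatechuate": [
        ("central intermediates", "protocatechuate 4,5-cleavage"),
        ("central intermediates", "protocatechuate 2,3-cleavage"),
        ("central intermediates", "β-ketoadipate pathway"),
    ],
}


def routes(start, goal, graph=FUNNELING_STEPS):
    """Return every route from start to goal as a list of step labels."""
    if start == goal:
        return [[]]
    found = []
    for product, label in graph.get(start, []):
        for tail in routes(product, goal, graph):
            found.append([label] + tail)
    return found


if __name__ == "__main__":
    for route in routes("ferulic acid", "central intermediates"):
        print(" -> ".join(route))
```

Running the script prints three routes that differ only in the final protocatechuate cleavage step.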
Although lignin degradation is mainly attributed to white-rot fungi and bacteria, oleaginous yeasts have been reported to be capable of degrading lignin and further assimilating the lignin-derived aromatic compounds. Rhodosporidium sp. can modify wheat straw and Sarkanda grass [23] and consume p-coumaric, p-hydroxybenzoic, and ferulic acids [24,25]. Additionally, species of this genus have high tolerance toward inhibitors generated during the pretreatment of lignocellulosic biomass [26]. Even though some model microorganisms, such as Saccharomyces cerevisiae and Escherichia coli, have been engineered for lignin valorization [27,28], non-model microbes have also been continually characterized to unveil novel ligninolytic pathways with biotechnological relevance. For instance, Rhodosporidium toruloides is genetically and physiologically well characterized, providing vital knowledge for further strain engineering [29].
In this context, a combination of omics approaches was used to determine the genetic potential and physiology of a novel Rhodosporidium fluviale strain LM-2 isolated from a lignin-degrading microbial consortium [30] (Fig. 1). Genomic analysis revealed several genes involved in lignin degradation and aromatic catabolism, and transcriptomic and secretomic analyses elucidated the metabolism of yeast grown in lignin-containing medium. To exploit the biotechnological potential of R. fluviale for the production of compounds of interest, this novel strain was cultivated in media containing ferulic acid. Afterwards, the resulting metabolites were identified by UHPLC-MS/MS to determine possible pathways for ferulic acid catabolism in this yeast. Combining these results and omics approaches, biocatalytic pathways that

Isolation and identification
In a previous study, Moraes and collaborators (2018) isolated and identified several bacteria and yeasts in the LigMet microbial community, a lignin-degrading consortium developed by growing cultures on low-molecular-weight (LW) lignin [30]. For isolation of microorganisms, the culture broth from LigMet was diluted and plated on agar supplemented with LW lignin and high-molecular-weight (HW) lignin with glucose (HW + G) [30]. For the analysis described herein, the yeast strain LM-2 was selected, which was capable of growing on LW and HW + G plates (data not shown) and was highly tolerant to different kraft lignin concentrations (Additional file 1: Fig. S1).
To identify LM-2, the ITS1 and ITS2 regions were sequenced, and a BLAST search was performed against the GenBank nucleotide (nonredundant) database. The analysis revealed that LM-2 shared 100% and 98% sequence identity with R. fluviale and Rhodosporidium azoricum, respectively (Table 1), and clustered together with R. fluviale DMKU RK253 [31] in a phylogenetic tree constructed from the ITS regions (Additional file 2: Fig. S2). This indicated that LM-2 was a novel R. fluviale strain, and it was therefore named R. fluviale LM-2.
Physiological tests based on cell viability at different temperatures have been used to distinguish between R. [...] Fig. S3), corroborating the result of species determination by phylogenetic analyses.
Fig. 1 (caption, partially recovered): A) [...] [30]. B) Omic approaches were used to characterize the genome, gene expression, and secreted proteins related to lignin degradation. C) R. fluviale LM-2 cells were cultivated in a ferulic acid-containing medium to identify the active catabolic pathways for this compound.

Gene expression and production of ligninolytic enzymes
Transcriptome and secretome analyses were combined to investigate the ability of R. fluviale LM-2 to degrade lignin and metabolize lignin-derived aromatic compounds. Transcriptomic data contained an average of 25 million and 17 million reads for replicates with kraft lignin-containing medium and glucose control, respectively. RNA-seq analysis identified 15,986 distinct transcripts, of which 3618 were differentially expressed during cultivation on lignin-containing medium (p ≤ 0.05), including 1657 upregulated (log2 fold change ≥ 1) and 1961 downregulated (log2 fold change ≤ −1) transcripts (Additional file 6: Fig. S6). The log2 fold change ranged from 8.5 to −9.1, and the top ten upregulated and downregulated genes by fold change are listed in Additional file 9: Table S3. The top ten upregulated genes consisted mainly of hypothetical proteins and dehydrin, which is a stress response protein (DHN family protein). On the other hand, the top ten downregulated genes included sugar transporters and proteins from the expansin family, which includes cell wall modification proteins [38]. Table 3 summarizes the upregulated genes encoding enzymes related to lignin depolymerization, including heme peroxidases (AA2), β-etherase, and CYP. No upregulated genes related to ferulic acid pathway II or the protocatechuate 2,3-cleavage pathway were identified under the conditions analyzed. Therefore, under these conditions, the transcriptomic analyses indicated that R. fluviale LM-2 catabolizes ferulic acid preferentially through ferulic acid pathway I, followed by the vanillin, protocatechuate 4,5-cleavage, and β-ketoadipate pathways (Fig. 4). A finding with indirect implications for lignin degradation is that the stress response enzyme catalase (Table 3) and 45 of the 229 major facilitator superfamily (MFS) transporter genes predicted in the R. fluviale LM-2 genome (Additional file 8: Table S2) were also upregulated during cultivation on lignin-containing medium.
Collectively, the analysis of gene expression and ligninolytic enzyme production confirmed that R. fluviale LM-2 can carry out the first and second steps of lignin degradation, producing H2O2, which is the substrate of peroxidases, and several enzymes involved in the conversion of phenolic compounds into central metabolic intermediates.

Bioconversion of ferulic acid into 4-VG
Ferulic acid is the major hydroxycinnamic acid recovered from plant biomass and has a broad spectrum of antibacterial, anti-inflammatory, and antioxidant activities [39]. In addition, this phenolic compound is an important precursor for the production of high value-added chemicals such as vanillin and 4-VG [40], which are aromatic compounds used to impart vanilla flavor and clove aroma to food products, respectively.
To evaluate the ability of R. fluviale LM-2 to convert ferulic acid into 4-VG and vanillin, the cells were cultivated in minimal medium with and without ferulic acid. Capillary electrophoresis of the extracellular fluid indicated that ferulic acid was not degraded spontaneously during cultivation (negative control: minimal medium with ferulic acid), and that R. fluviale LM-2 completely consumed this compound within 24 h (Fig. 6B). In addition, ferulic acid and its possible conversion products (4-VG and vanillin) were detected intracellularly by mass spectrometry after 12 h of cultivation (Fig. 6C and Additional file 10: Table S4). For the latter analysis, the results for R. fluviale LM-2 cultivated with ferulic acid were compared with those for R. fluviale LM-2 cultivated without ferulic acid. Mass spectrometry analysis thus validated the presence of the two metabolic pathways for ferulic acid conversion predicted from the genomic analysis (Additional file 5: Fig. S5): one based on the sequential action of FCS and FCHL to produce vanillin (ferulic acid pathway I), and the other based on the decarboxylation of ferulic acid to produce 4-VG (ferulic acid pathway II), catalyzed by PDC.

Discussion
To our knowledge, the genomic analysis of R. fluviale LM-2 in this study is the first for this species. As with other species from this genus, R. fluviale is capable of accumulating lipids (up to 50-70% of its dry weight) [41] and of tolerating inhibitory compounds [42]. For example, R. fluviale DMKU-RK253, which was isolated after enrichment of sugarcane leaf samples [41], accumulated high lipid levels after cultivation in glycerol-containing medium [31]. Comparative genomic analysis revealed that R. fluviale LM-2 harbors a relatively large genome (50 Mb), with a larger set of predicted protein-coding genes (17,565) than R. toruloides (20.2 Mb, containing 8171 protein-coding genes) [43]. Genome size variation is a typical response in fungi when adapting to a specific habitat or ecological niche and involves genome duplication and translocation [44]. For instance, the genome of Phanerochaete carnosa shows a tandem duplication of ligninolytic genes compared to P. chrysosporium [45]. Furthermore, duplication of these genes could confer a competitive advantage on ligninolytic organisms in nature, compared to other organisms that have less tolerance to the toxicity of aromatic compounds [46]. The large size of the R. fluviale LM-2 genome was also supported by the de novo transcriptome assembly using Trinity, which was about 44 Mb with a completeness of 80%.
Multi-omic analysis of R. fluviale LM-2 elucidated a repertoire of genes, transcripts, and secreted proteins involved in lignin degradation, as well as the ability to convert lignin-derived aromatics into vanillin and 4-VG. R. fluviale LM-2 expresses heme peroxidases (AA2), β-etherases, and CYPs for lignin depolymerization, as well as several enzymes with a Pfam domain related to aromatic degradation. In contrast to Rhodotorula sp. R2 [23], which secretes enzymes that act directly on lignin macromolecules, the AA2 enzyme from R. fluviale LM-2 was not secreted under the conditions analyzed.
However, R. fluviale LM-2 appears to perform the first step of lignin degradation by secreting GLOX-AA5_1. Although aromatic compound metabolism by yeast is an uncommon phenotype (for example, S. cerevisiae INVSc1 (Invitrogen™) takes up low levels of mono-aryl compounds for further metabolism [47]), oleaginous yeasts such as R. fluviale LM-2 have been extensively studied in this context, since the products of aromatic metabolism (acetyl-CoA and pyruvate) are fatty acid biosynthesis precursors. For instance, Yaguchi and collaborators (2020) screened 36 yeast strains cultivated with several aromatic compounds, in which each species presented a unique metabolism and tolerance profile to these compounds [48]. Additionally, it is important to mention that the work described here focused on the secreted enzymes for lignin depolymerization. A whole-proteome analysis would also help to improve the coverage of lignin-inducible genes in this yeast.
Collectively, the data indicated that during lignin catabolism, genes involved in stress response were upregulated in R. fluviale LM-2, probably in response to the toxicity of aromatic compounds. For example, enzymes responsible for controlling intracellular ROS production, such as catalases, were overexpressed in the presence of lignin. In addition, DHN was also overexpressed under the conditions analyzed. DHN has been characterized as a stress protein in plants, and is involved in membrane protection, cryoprotection of enzymes, and protection from reactive oxygen species [49,50]. Beyond its demethylation role, CYP is also an essential component of the stress response system [51] and could therefore play a role in the adaptation of R. fluviale LM-2 to support ligninolytic pathways.
Among the transporters, MFS transporters were upregulated in response to growth in lignin-containing medium, indicating their importance in lignin catabolism in R. fluviale LM-2. Members of this superfamily transport various small compounds, including aromatic compounds, across biological membranes [52]. Moreover, the overexpression of an MFS transporter has been described as crucial for increasing protocatechuate conversion in Sphingobium sp. strain SYK-6 [53].
The biocatalytic conversion of ferulic acid can be useful for the production of desired chemicals [46,54,55]. Ferulic acid is usually converted to vanillin by FCS and FCHL with ATP consumption in two steps [56,57]: (i) CoA-thioesterification of ferulic acid by FCS, and (ii) hydration of feruloyl-CoA by FCHL. The alternative catabolic route of ferulic acid through the 4-VG pathway is a detoxification process involving non-oxidative decarboxylation driven by the cofactor-free enzyme PDC [20]. Sequences encoding an FCS, an FCHL, and a PDC were identified in the R. fluviale LM-2 genome based on Pfam domain analyses. This finding and the detection of vanillin and 4-VG after cultivation with ferulic acid indicated that these two ferulic acid catabolic pathways are functional in R. fluviale LM-2. Furthermore, similar to R. toruloides IFO0880 [24], R. fluviale LM-2 consumed all the ferulic acid added to the culture medium within 24 h of growth.

Conclusion
The present study shows that R. fluviale LM-2 possesses a wide spectrum of enzymes involved in lignin and phenylpropanoid degradation, which can be useful for lignin valorization strategies. Therefore, these results suggest that R. fluviale LM-2 could not only be classified as a ligninolytic yeast, but also as a degrader of polycyclic aromatic hydrocarbons and heterocyclic aromatic pollutants. In summary, the omics-based characterization of R. fluviale LM-2 opens new opportunities for biotechnological applications of this yeast. The availability of genomic data can support the genetic manipulation of this yeast and the development of lignin valorization strategies. In addition, this work uncovered functional ligninolytic pathways and novel genes, including FCS, FCHL, and PDC enzymes, that can be exported to other systems, such as model organisms, in biotechnology for the production of biofuels and bioproducts.

Isolation and identification
The yeast strain was identified in a lignin-degrading microbial consortium established on acidified black liquor generated from delignification of steam-exploded sugarcane bagasse (LW lignin) [30]. The yeast strain was isolated on separate agar plates containing 1:1 (v/v) LW lignin, or 0.25% (w/v) HW lignin and 0.1% (w/v) glucose (HW + G) [30]. The strain's tolerance to different concentrations of kraft lignin was analyzed using a spot plating assay, in which several dilutions (10⁻¹ to 10⁻⁷) of the cells were plated on agar supplemented with kraft lignin at four concentrations: 1%, 0.5%, 0.25%, and 0.125%. For species identification, total DNA was extracted using the FastDNA® Spin Kit for Soil (MP Biomedicals, Irvine, CA, USA), according to the manufacturer's instructions. The quality and concentration of the extracted DNA were evaluated using 1.0% (m/v) agarose gel electrophoresis and by measuring the absorbance at 260 nm using a NanoDrop® 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The sequences of the hypervariable internal transcribed spacer ITS1 and ITS2 regions were amplified by PCR using the following primers: ITS3 forward (GCA TCG ATG AAG AAC GCA GC) and ITS4 reverse (TCC TCC GCT TAT TGA TAT GC). The resulting sequences were compared with those in the GenBank nonredundant nucleotide sequence database using the BLASTn algorithm [58].
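As a quick sanity check on the two primers listed above, basic properties such as length, GC content, and a rule-of-thumb melting temperature can be computed directly from the sequences. The sketch below is illustrative only: the helper names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough approximation chosen for this example, not the method used in the study to set annealing conditions.

```python
# Primer sequences exactly as given in the text (spaces separate codon-like triplets).
ITS3_FORWARD = "GCA TCG ATG AAG AAC GCA GC"
ITS4_REVERSE = "TCC TCC GCT TAT TGA TAT GC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def clean(primer):
    """Strip spaces and normalize to upper case."""
    return primer.replace(" ", "").upper()


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C (Wallace rule, an assumption here)."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(primer):
    """Reverse complement, i.e. the primer as it reads on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(primer)))


if __name__ == "__main__":
    for name, primer in (("ITS3", ITS3_FORWARD), ("ITS4", ITS4_REVERSE)):
        print(name, len(clean(primer)), gc_fraction(primer), wallace_tm(primer))
```

Both primers are 20 nt long, so the two estimates differ only because of their base composition.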

Genome sequencing and assembly
For genome sequencing, paired-end and mate-pair libraries were constructed using the Nextera XT DNA Sample Prep Kit and Nextera® Mate Pair Sample Preparation Kit (Illumina, San Diego, CA, USA), respectively. Libraries were quantified and quality checked using the KAPA library quantification kit (Merck, Darmstadt, Germany) and Bioanalyzer high-sensitivity DNA chips (Agilent, Santa Clara, CA, USA), and then sequenced on an Illumina MiSeq platform using 2 × 300 bp reads, according to the manufacturer's instructions.
Illumina reads of different sizes were first filtered to remove adapters and low-quality reads using NextClip [59] and Trimmomatic 0.32 [60] with default settings. The genome was assembled de novo using Velvet 1.2.10 [61], and SSPACE [62] was used for scaffolding with the mate-pair reads. Pilon [63] was used to further improve the genome assembly. Gene calling was performed with the Maker pipeline [64], using Augustus [65] and SNAP [66]. Genome completeness was assessed with BUSCO v2 [67].
Although R. fluviale LM-2 has a high tolerance to different concentrations of kraft lignin and can metabolize phenolic compounds, as described in the manuscript, the yeast could not grow well in liquid media with lignin as the only carbon source (data not shown). Thus, to proceed with the transcriptome analysis, 0.1% glucose was added to the culture medium; this glucose was consumed by the yeast in less than 4 h (data not shown). Cells were harvested by centrifugation at 4000 rpm for 10 min, and total RNA was preserved using TRIzol® (Thermo Fisher Scientific, Waltham, MA, USA), followed by extraction and purification using the RNeasy Plant Mini Kit (QIAGEN, Hilden, Germany). The quantity and quality of the extracted RNA were determined using a NanoDrop® 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and a 2100 Bioanalyzer platform (Agilent, Santa Clara, CA, USA), with a minimum RIN (RNA integrity number) of 7.0 [74].
Libraries were prepared using a TruSeq ® Stranded Total RNA Library Prep Kit (Illumina, San Diego, CA, USA). Quality and quantity were determined using capillary electrophoresis on a 2100 Bioanalyzer platform (Agilent, Santa Clara, CA, USA) and the KAPA Library Quantification Kit for Illumina (Merck, Darmstadt, Germany), respectively. The libraries were sequenced using the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA), according to the manufacturer's instructions.
Reads were preprocessed as described previously for the genome libraries, and rRNA was evaluated and filtered out using SortMeRNA. The filtered data were mapped against the R. fluviale LM-2 reference genome sequenced in this study using the TopHat2 algorithm [75]. Differential gene expression analysis was based on count data and was performed in R using the Bioconductor DESeq2 package [76] through paired comparisons against the control (medium lacking lignin). Transcripts showing differential expression (log2 fold change ≥ 1 or ≤ −1) relative to the control were determined using p ≤ 0.05 as the threshold. A volcano plot was generated using R scripts with the log2 fold change value as the input.
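The thresholding step described above can be illustrated with a short sketch that labels a transcript from its log2 fold change and p value. This is not the DESeq2 analysis itself, which was run in R: the class and function names are hypothetical, the example records are invented purely for illustration, and whether the p values in the study were raw or adjusted is not stated in the text, so the sketch simply takes whichever value is supplied.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str                # hypothetical identifier
    log2_fold_change: float  # lignin-containing medium versus control
    p_value: float


def classify(transcript, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a transcript using the cutoffs stated in the text."""
    if transcript.p_value > p_cutoff:
        return "unchanged"
    if transcript.log2_fold_change >= lfc_cutoff:
        return "upregulated"
    if transcript.log2_fold_change <= -lfc_cutoff:
        return "downregulated"
    return "unchanged"


if __name__ == "__main__":
    # Invented example records, not data from the study.
    examples = [
        Transcript("t1", 8.5, 0.001),
        Transcript("t2", -9.1, 0.001),
        Transcript("t3", 0.4, 0.01),
        Transcript("t4", 3.0, 0.20),
    ]
    for t in examples:
        print(t.name, classify(t))
```

Note that both boundaries are inclusive, matching the "≥ 1" and "≤ −1" wording, and that a significant p value alone is not sufficient for a transcript to be called differentially expressed.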

Growth conditions for secretome analysis
After overnight preculture in YPD broth (1% yeast extract, 2% peptone, and 2% glucose), the cells were harvested by centrifugation (4000 rpm, 10 min), washed with 30 ml of PBS, and inoculated into a flask containing 100 ml of yeast nitrogen base (YNB) with amino acids (Sigma-Aldrich, Y1250) supplemented with 0.1% kraft lignin and 0.1% glucose to an initial OD600 of 0.1. After 24 h, the cultures were centrifuged (4000 rpm, 10 min), and the supernatants were filtered through 0.45 μm and 0.2 μm MF-Millipore® membrane filters (Merck, Darmstadt, Germany) to remove residual cells. Proteins were concentrated using Vivaspin 20 ultrafiltration spin columns (Sartorius Stedim, Göttingen, Germany) with a molecular mass cutoff of 3 kDa, and quantified using the Bradford assay (Bio-Rad, Hercules, CA, USA) [77]. The proteins were separated by 10% SDS-PAGE, and the protein bands were excised and analyzed by mass spectrometry.
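Inoculating to a defined initial OD600 is a simple dilution calculation (C1·V1 = C2·V2). The sketch below shows the arithmetic under two assumptions that are not stated in the text: that OD600 scales linearly with cell density over the range used, and that the OD600 of the washed cell suspension has been measured. The function name and the suspension OD in the usage example are hypothetical.

```python
def inoculum_volume_ml(suspension_od, target_od, final_volume_ml):
    """Volume of cell suspension needed so the final culture starts at target_od.

    Uses C1 * V1 = C2 * V2, assuming OD600 is proportional to cell density.
    """
    if suspension_od <= target_od:
        raise ValueError("suspension must be denser than the target culture")
    return target_od * final_volume_ml / suspension_od


if __name__ == "__main__":
    # Hypothetical washed-cell suspension at OD600 = 5.0; target values from the text.
    volume = inoculum_volume_ml(suspension_od=5.0, target_od=0.1, final_volume_ml=100.0)
    print(f"Add {volume:.2f} ml of suspension to reach 100 ml at OD600 0.1")
```

With these example numbers the required volume is 2 ml; in practice the suspension would be diluted before measurement if its OD600 fell outside the spectrophotometer's linear range.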

Secretome mass spectrometry analysis and data processing
For secretome analysis, aliquots of 25 µg of the concentrated supernatant were subjected to SDS-PAGE in triplicate, and the protein bands were excised and analyzed using a Micro LC-MS/MS QTof XEVO G2 XS instrument (Waters, Milford, MA, USA) at the Life Sciences Core Facility (LaCTAD, UNICAMP, Campinas, SP, Brazil). The columns were equilibrated with 93% mobile phase A (0.1% formic acid in water) and 7% mobile phase B (0.1% formic acid in acetonitrile) at 40 °C. Peptides were eluted from the C18 trap column (Waters, Milford, MA, USA) and separated by gradient elution (7% to 40% acetonitrile) on an ACQUITY UPLC M-Class HSS T3 analytical column (Waters, Milford, MA, USA).
Data-independent acquisition (MSE) was carried out by operating the instrument in positive ion V mode, applying the MS and MS/MS functions over 0.5 s intervals with 6 V low energy and 15-45 V high energy collision, to obtain the peptide mass to charge ratio (m/z) and product ion information, for deducing the amino acid sequence. The capillary voltage and source temperature were set to 3.0 kV and 80 ℃, respectively. To correct the mass drift, the internal mass calibrant leucine enkephalin (556.2771 Da) was infused every 30 s through a lock spray ion source at a flow rate of 3 µL/min. Peptide signal data were collected between 100 and 2000 m/z values.
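The lock-mass step described above compares the observed calibrant signal with its reference value and uses the difference to correct mass drift. The sketch below shows one simple way to express that idea: the mass error in ppm and a single-point proportional rescaling. The rescaling scheme and function names are assumptions made for illustration; the instrument software's actual correction algorithm is not described in the text.

```python
# Reference value for leucine enkephalin as given in the text.
LEU_ENK_REFERENCE = 556.2771


def ppm_error(observed, reference=LEU_ENK_REFERENCE):
    """Mass error of an observed value relative to the reference, in ppm."""
    return (observed - reference) / reference * 1e6


def correct_mz(measured_mz, observed_lock, reference=LEU_ENK_REFERENCE):
    """Rescale a measured m/z by the ratio reference/observed lock mass.

    A single-point proportional correction, assumed here for illustration.
    """
    return measured_mz * reference / observed_lock


if __name__ == "__main__":
    observed_lock = 556.2799  # hypothetical drifted reading of the calibrant
    print(f"drift: {ppm_error(observed_lock):.2f} ppm")
    print(f"corrected m/z: {correct_mz(785.8460, observed_lock):.4f}")
```

By construction, applying the correction to the drifted calibrant reading itself returns the reference value.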
Proteins present in the samples were identified through comparison with the protein sequences previously predicted in the genome analysis, setting the minimum number of fragment ion matches per peptide and per protein to three and five, respectively. The false discovery rate (FDR) was set at 4%. The FDR for peptide and protein identification was determined by searching a reversed database, which was generated automatically using ProteinLynx Global SERVER™ (PLGS) software (Waters, Milford, MA, USA) by reversing the sequence of each entry. All protein hits were identified at a confidence level of > 95%. Raw data processing and protein identification were performed using ProteinLynx Global SERVER 3.0.3 (Waters, Milford, MA, USA).
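The reversed-database strategy described above can be sketched in a few lines: each protein sequence is reversed to create a decoy entry, and the proportion of decoy hits among accepted identifications gives an estimate of the FDR. The code below is a generic illustration of this target-decoy idea, not the PLGS implementation; the function names, the `REV_` prefix, the scoring scale, and the decoys-over-targets estimator are all assumptions made for this example.

```python
def reverse_entries(proteins):
    """Build decoy entries by reversing each sequence (names get a hypothetical REV_ prefix)."""
    return {f"REV_{name}": sequence[::-1] for name, sequence in proteins.items()}


def estimated_fdr(hits, threshold):
    """Estimate FDR as decoy hits divided by target hits at or above a score threshold.

    hits is an iterable of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    # Invented sequences and scores, for illustration only.
    print(reverse_entries({"protein_1": "MKTAYIAK"}))
    hits = [(90, False), (80, False), (70, True), (60, False), (50, True)]
    for threshold in (85, 60):
        print(threshold, estimated_fdr(hits, threshold))
```

Raising the score threshold lowers the estimated FDR at the cost of fewer accepted identifications, which is the trade-off a fixed FDR cutoff such as 4% resolves.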

Yeast cultivation and secondary metabolite extraction
After overnight preculture in YPD broth, the yeast was cultivated in YNB minimal medium with amino acids, supplemented with 0.1% glucose, with and without 1.25 mM ferulic acid (Sigma-Aldrich, 128708), to an initial OD600 of 0.1. The medium without cells was used as a control for compound degradation. Cultures were sampled after 6, 12, 24, 48, 72, and 96 h for capillary electrophoresis (extracellular fluid) and mass spectrometry (intracellular fluid).